Graph Mining based on a Data Partitioning Approach

Authors

  • Son N. Nguyen
  • Maria E. Orlowska
  • Xue Li
Abstract

Existing graph mining algorithms typically assume that the dataset fits in main memory. Because many large graph datasets do not satisfy this condition, truly scalable graph mining remains a challenging computational problem. In this paper, we present a new horizontal data partitioning framework for graph mining. The original dataset is divided into fragments, each fragment is mined individually, and the results are combined to generate a global result. One of the challenging problems in graph mining is completeness, because of the complexity of graph structures. We prove the completeness of our algorithm in this paper. Experiments are conducted to illustrate the efficiency of our data partitioning approach.
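
The partition, mine, and combine flow described in the abstract can be sketched roughly as follows. This is not the authors' algorithm: as a simplifying assumption, each graph is modeled as a set of labeled edges and a "pattern" is a single labeled edge, so no subgraph isomorphism test is needed. The function names (`partition`, `local_frequent`, `mine_partitioned`) and the `min_ratio` parameter are hypothetical. The sketch illustrates one standard completeness argument for horizontal partitioning: a pattern that reaches a relative support threshold over the whole dataset must reach that same threshold in at least one fragment, so the union of locally frequent patterns is a complete candidate set, and a final counting pass removes the false positives.

```python
from collections import Counter
from math import ceil


def partition(dataset, k):
    """Split the dataset horizontally into at most k contiguous fragments."""
    size = ceil(len(dataset) / k)
    return [dataset[i:i + size] for i in range(0, len(dataset), size)]


def local_frequent(graphs, min_ratio):
    """Return the edges that occur in at least min_ratio of the given graphs.

    Assumption: each graph is a set of labeled edges, so an edge is
    counted at most once per graph.
    """
    counts = Counter(edge for graph in graphs for edge in graph)
    return {edge for edge, c in counts.items() if c >= min_ratio * len(graphs)}


def mine_partitioned(dataset, k, min_ratio):
    """Mine each fragment separately, then verify candidates globally."""
    # Phase 1: any globally frequent edge is locally frequent in at least
    # one fragment, so the union of local results misses nothing.
    candidates = set()
    for fragment in partition(dataset, k):
        candidates |= local_frequent(fragment, min_ratio)

    # Phase 2: one counting pass over the full dataset discards candidates
    # that were frequent only locally.
    counts = Counter(
        edge for graph in dataset for edge in graph if edge in candidates
    )
    return {e for e, c in counts.items() if c >= min_ratio * len(dataset)}
```

In this toy setting only one fragment needs to be held in memory during the first phase, and the second phase is a single sequential scan. Real subgraph patterns would require a proper frequent subgraph miner in place of `local_frequent`.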

Similar resources

An effective multilevel tabu search approach for balanced graph partitioning

Graph partitioning is one of the fundamental NP-complete problems which is widely applied in many domains, such as VLSI design, image segmentation, data mining etc. Given a graph G = (V,E), the balanced k-partitioning problem consists in partitioning the vertex set V into k disjoint subsets of about the same size, such that the number of cutting edges is minimized. In this paper, we present a m...

Density-based data partitioning strategy to approximate large-scale subgraph mining

Recently, graph mining approaches have become very popular, especially in certain domains such as bioinformatics, chemoinformatics and social networks. One of the most challenging tasks is frequent subgraph discovery. This task has been highly motivated by the tremendously increasing size of existing graph databases. Due to this fact, there is an urgent need of efficient and scaling approaches ...

Efficient Mining of Heterogeneous Star-Structured Data

Many of the real world clustering problems arising in data mining applications are heterogeneous in nature. Heterogeneous co-clustering involves simultaneous clustering of objects of two or more data types. While pairwise co-clustering of two data types has been well studied in the literature, research on high-order heterogeneous co-clustering is still limited. In this paper, we propose a graph...

Customer Retention Based on the Number of Purchase: A Data Mining Approach

Purpose: this study wants to find any relationship between the numbers of purchase and the income the customer brings to the company. The attempt is to find those customers who buy more than one life insurance policy and represent the signs of good payments at the same time by the help of data mining tools. Design/ methodology/ approach: the approach of this research is to use data mining tools...

Rough Set Theory: Approach for Similarity Measure in Cluster Analysis

Clustering of data is an important data mining application. One of the problems with traditional partitioning clustering methods is that they partition the data into hard bound number of clusters. Rough set based Indiscernibility relation combined with indiscernibility graph, leads to knowledge discovery in an elegant way. Indiscernibility relation has a strong appeal to be applied in clustering...

Publication date: 2008